Analysis of the Effects of Interpolation 
and Enhancement of LANDSAT-1 Data on 
Classification and Area Estimation 
Accuracy 


N. Chu 
C. McGillem 
P. Anuta 


I. Introduction 


Numerical classification of digital multispectral scanner
data from aircraft and satellite sensors using computer techniques
is complicated by many factors. One of these is the effect of the
finite instantaneous field of view (IFOV) of the scanning sensor,
which "blurs" or averages the signal from a finite area into a single
sample generated by the data system. For the Landsat-1 sensor the
"blur" area is approximately an 80 meter diameter circle, and for a
typical aircraft scanner system the area may be a circle 10 meters or
less in diameter. This finite area sample may contain "pure" or
homogeneous scene material, or it may contain a mixture of two or
more materials whose boundaries pass through the pixel area. For
pixels covering homogeneous areas the finite pixel area causes little
trouble, and in fact classification may be improved by the smoothing
effect of gathering energy from the surrounding areas. For the
overlap case, however, a contamination of pure spectral signatures
results, causing difficulty in properly classifying the boundary
pixels. The work reported here is a preliminary evaluation of the
effects of a particular data enhancement approach aimed at improving
classification performance in such cases.

Some researchers have approached the boundary classification
problem by modeling mixture spectra as linear combinations of pure
spectra, and in so doing attempt to determine the fractional area in
each pixel covered by each pure material. These approaches seek to
analyze and model the mixture phenomena of each original pixel and
have not proven particularly effective. The work reported here takes
a different approach by attempting to improve the resolution of the
image through use of special signal processing techniques.
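
For reference, the linear mixture model mentioned above can be
sketched as follows. This is only an illustration under stated
assumptions, not the procedure of the researchers cited: the function
name and the two four-channel "pure" spectra are hypothetical, and the
fraction is recovered by ordinary least squares on a single unknown.

```python
import numpy as np

def mixture_fraction(pixel, pure_a, pure_b):
    """Least-squares fraction f such that pixel ~ f*pure_a + (1-f)*pure_b."""
    pixel, pure_a, pure_b = (np.asarray(v, dtype=float)
                             for v in (pixel, pure_a, pure_b))
    d = pure_a - pure_b
    f = np.dot(pixel - pure_b, d) / np.dot(d, d)
    # A physical area fraction must lie between 0 and 1.
    return float(np.clip(f, 0.0, 1.0))

# Hypothetical spectra, not taken from the data of this report.
water = np.array([24.0, 18.0, 15.0, 4.0])
land = np.array([28.0, 20.0, 44.0, 25.0])
mixed = 0.3 * water + 0.7 * land   # boundary pixel, 30% water by area
fraction = mixture_fraction(mixed, water, land)
```

With noise-free data the fraction is recovered exactly; with real
data the estimate degrades as the pure spectra become less distinct.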


Since the size of the point spread function of the
LANDSAT-1 MSS system is fixed, it is not possible to directly
alter the area encompassed by one pixel of the system output.
It is possible, however, to carry out signal processing operations
utilizing weighted sums of surrounding pixels to generate
new data points or to modify existing data points so as to reduce
the fraction of the data that falls into the boundary category.

One such method makes use of interpolation procedures that
generate new points between the original points. This is
accomplished by fitting a smooth surface to surrounding points and
then computing intermediate points from the equation for the
smooth surface. This leads to a more gradual transition at the
boundary and thus an increased likelihood that a portion of what
was formerly the boundary will fall into one or the other of the
classes on either side of the boundary.
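
The idea can be illustrated in one dimension. The sketch below is
not the interpolation program used in this study; it assumes a
four-point cubic (Lagrange) fit along a single scan line, with the
end samples replicated so that every interval has four neighbors.
The function name is hypothetical.

```python
import numpy as np

def cubic_interpolate_4x(samples):
    """Insert three new points between neighbors using a 4-point cubic fit."""
    s = np.asarray(samples, dtype=float)
    padded = np.pad(s, (1, 1), mode="edge")  # replicate the end samples
    out = []
    for i in range(len(s) - 1):
        # Four samples at positions -1, 0, 1, 2 around the interval [0, 1].
        p0, p1, p2, p3 = padded[i:i + 4]
        for t in (0.0, 0.25, 0.5, 0.75):
            w0 = -t * (t - 1) * (t - 2) / 6
            w1 = (t + 1) * (t - 1) * (t - 2) / 2
            w2 = -(t + 1) * t * (t - 2) / 2
            w3 = (t + 1) * t * (t - 1) / 6
            out.append(w0 * p0 + w1 * p1 + w2 * p2 + w3 * p3)
    out.append(s[-1])
    return np.array(out)

# A sharp field boundary sampled at the original pixel spacing.
coarse = [20, 20, 20, 40, 40, 40]
fine = cubic_interpolate_4x(coarse)
```

The original samples are preserved at every fourth output point,
and the new points between the third and fourth samples form the
gradual transition described above.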

A more powerful method of reducing the effects of boundaries
is through use of an image restoration filter designed to reduce
the effective instantaneous field of view of the scanner.
One such filter that has been developed for LANDSAT data
preprocessing provides approximately a 65% reduction in the effective
area of a single pixel of ERTS data while still controlling the
noise and sidelobe levels in the resultant image.
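
To give a feel for what a 65% area reduction means, the short
calculation below converts it to an equivalent footprint diameter.
It assumes the footprint can be treated as a circle, as in the
80 meter figure quoted in the Introduction; the function name is
hypothetical.

```python
import math

def effective_diameter(original_diameter_m, area_reduction):
    """Diameter of a circular footprint after its area shrinks by a fraction."""
    # Area scales with the square of the diameter.
    return original_diameter_m * math.sqrt(1.0 - area_reduction)

restored = effective_diameter(80.0, 0.65)  # roughly 47 m
```

Under this assumption the linear resolution improves by a factor of
about 1.7, not by the factor of nearly 3 that the area figure alone
might suggest.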

The restoration filter permits generation of new data points
between original points which have a smaller instantaneous field
of view, as well as reducing the instantaneous field of view of
the original points. Thus, the restoration filter method has
the potential for increasing the effective resolution of the data
and thereby reducing the percentage of overlap pixels occurring
at boundaries relative to the total number in the scene. The problem
of the overlap pixel is therefore attacked here by reducing the
effective size of the pixel rather than trying to analyze the
fractional components of the original pixels.

The results of classifying LANDSAT-1 MSS data after preprocessing
both by interpolation and by restoration filtering are described.
In Section II, results are presented for the straightforward
application of interpolation to typical farm land for the
purpose of estimating crop acreages. No general improvement in
accuracy is found to result from this procedure. In fact, although
the results are mixed, there may be a slight reduction in average
accuracy using this technique. These results are inconclusive owing
to the lack of training statistics and of clear knowledge of the
placement of boundaries. Restoration filter preprocessing was not
carried out for the crop classification experiment due to resource
limitations. This is suggested for further work.

Section III describes the application of interpolation and
enhancement techniques to estimation of the areas of lakes. Again,
it is found that conventional processing of interpolated data
using a single set of training areas gives no improvement in
accuracy over uninterpolated data. However, by selecting special
training areas from the lakes, it is found that a significant
improvement in accuracy is obtained. When the enhancement
preprocessing technique is employed, a very marked improvement in
accuracy is obtained and the results become very consistent.
With this procedure the estimation error is reduced by a factor
of two over that obtained with the unpreprocessed data.


Section IV discusses certain peculiarities of the analysis
procedure used and suggests how further improvements might be
made with both the interpolation and enhancement techniques.

II. Crop Acreage Estimation

The area selected for analysis lies in DeKalb, Ogle and Lee
Counties in northern Illinois. These areas are primarily farmland,
and considerable ground truth is available for this region.
The Landsat-1 MSS data for the area was collected on August 9, 1972
(Scene No. 1017-16093).

An area of slightly more than 18,000 acres (128 x 128 pixels)
was interpolated with a cubic polynomial (POLYINT) to provide a
4x4 enlargement (512 x 512 pixels) of the original data set. The
interpolated data set was then classified using standard procedures
and compared with classification of the non-interpolated data. The
results are shown in Table 1. The classes considered are corn,
soybeans, and "other", consisting of all other materials found in the
area such as alfalfa, oats, pasture, trees, water, bare soil, etc.
The training and test fields were selected from the imagery using
ground truth, and the boundaries were set so that the IFOV did not
include border mixture pixels. Classification was carried out using
the statistics of training sets taken from the interpolated data and
also using the statistics of training sets taken from the original
(uninterpolated) data. It is seen that interpolation does not
significantly change the classification accuracy. There is a slight
increase in the average class accuracy and a slight decrease in
the overall accuracy.
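
The two accuracy measures can move in opposite directions because
the average class accuracy weights every class equally while the
overall accuracy weights every pixel equally. The sketch below
illustrates this with two confusion matrices whose counts are
entirely hypothetical; they are not the values of Table 1.

```python
import numpy as np

def accuracies(confusion):
    """Return (average class accuracy, overall accuracy).

    Rows are true classes, columns are assigned classes.
    """
    c = np.asarray(confusion, dtype=float)
    per_class = np.diag(c) / c.sum(axis=1)
    overall = np.trace(c) / c.sum()
    return float(per_class.mean()), float(overall)

# Hypothetical test-field counts for corn, soybeans and "other".
before = [[900, 50, 50], [30, 60, 10], [20, 10, 70]]
after = [[880, 60, 60], [25, 68, 7], [18, 8, 74]]

avg_before, overall_before = accuracies(before)
avg_after, overall_after = accuracies(after)
```

In this example a small loss in the dominant class outweighs, pixel
for pixel, the gains in the two minor classes, so the overall figure
falls while the class average rises.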
The original data set had spectral components with amplitudes
in the range 16-40 out of the maximum possible dynamic range
of 0-127. In order to see whether this limited dynamic range had
adversely affected the interpolation process, the dynamic range of
the original data was doubled by multiplying all amplitudes by a
factor of two. The interpolation was then carried out on this new
data set and classification carried out in the same manner as
before. The results are shown in Table 2 and are essentially the
same as those obtained with the data having a more restricted dynamic
range.
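
The doubling step can be sketched as below. How overflows beyond the
0-127 range were handled is not stated here, so the clipping used in
this sketch is an assumption, as are the function name and the sample
values.

```python
import numpy as np

def double_dynamic_range(data, max_level=127):
    """Multiply amplitudes by two; return clipped data and overflow fraction."""
    doubled = np.asarray(data, dtype=np.int32) * 2
    overflow_fraction = float(np.mean(doubled > max_level))
    # Assumption: values above the maximum level are clipped to it.
    return np.clip(doubled, 0, max_level), overflow_fraction

# Hypothetical amplitudes; only the last one overflows when doubled.
clipped, overflow = double_dynamic_range([16, 25, 40, 63, 64])
```

Since most amplitudes lie in the range 16-40, doubling leaves nearly
all values well inside the available range.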

Results for a different area are shown in Table 3. Again no 
appreciable changes in classification accuracy were obtained. 

From the above results it appears that there is no improvement
in training and test field performance using interpolated
data and that there may in fact be a slight loss (1-2%) in
accuracy. One possible explanation for this result is as follows.
The interpolation procedure produces new points near a boundary
that are different from the original boundary pixels and also
different from the class within the boundary. However, the
training and test areas are chosen completely from within the
boundaries and therefore do not include any of these "intermediate"
points. Thus the classifier does not accept these points as
members of the class on which it was trained.
As discussed in Section III, it is likely that by expanding the
training areas to include interpolated points near the boundary
it may be possible to obtain significant improvement in
performance.



col. 1073-1328, run 72032803. The original data is dynamically
doubled. Overflows are less than 0.01%.











III. Estimation of Water Acreage


The accuracy of estimating water area by classification of
ERTS MSS data has been studied previously by Bartolucci. Seven
of the lakes used in this previous study were selected for
analysis. The areas of the lakes range from 15 to more than 1,800
acres, and their "true" areas are taken from USGS data. Surveys
during the years 1969 to 1971 provide a reliable source of the
actual average water area of these lakes in the month of May, and
these areas were taken to be the true values. The Landsat-1 data
was gathered on May 4, 1973 (Scene No. 1285-15595).

Three types of data sets were analyzed: original data; 4x4
interpolated data (POLYINT); and 4x3 interpolated and enhanced
data. For each of the chosen lakes, the surrounding land area
was classified against the class water. A clustering routine
was used as a guide to provide the training samples required by
the classifier. In general there are several classes existing
between the lake water and the surrounding land; e.g.,
water-land boundary, water-vegetative boundary, and shallow or muddy
water. These classes can be investigated by studying their
spectral signatures as required. This is discussed in detail by
Bartolucci. There are two processes affecting results here.
One is the existence of the several boundary classes (which in
fact may be a continuous gradation from deep water to land cover)
and the other is the effect of the instantaneous field of view.
Thus, the situation is more complex than that for the crop field
case, where the boundary between fields is sharp relative to the
instantaneous field of view of the scanner.


It is found experimentally that selection of training areas
strongly affects the classification accuracies obtained. In an
attempt to reduce the variability produced by this subjective aspect
of classification, it was decided to classify all of the chosen lakes
using the same set of training areas. The training set was selected
from several lakes judged to have typical spectral characteristics.
The results of this analysis using interpolated data are shown in
Columns 1, 2 and 3 of Table 4. In Columns 1 and 2 the training
set was selected from the original data, while in Column 3 the
training set was selected from the interpolated data.

Since a single training set was used for all classifications,
it follows that if a particular lake has spectral characteristics
that deviate significantly from the norm, then the results may
prove less accurate than what is possible when the training sets
are selected for each lake individually. Column 4 of Table 4
shows the results obtained when the training areas were selected
for each lake individually.

Comparing the results for the original data (Column 1) with
those for the corresponding interpolated data (Columns 2 and 3)
shows a slight reduction (1-3%) in accuracy of the estimates of
area. Note that the errors are always on the low side and are always
greater percentage-wise for smaller lakes than for larger lakes.
This supports the assumption that the error comes from the
inability of the classifier to properly allocate the boundary
points to the adjacent classes. In Column 4, where individual
training sets for each lake are used and where all points interior
to the points classified as "boundary" are included, there is a
significant improvement in accuracy. This is very evident for
the smaller lakes, where the results are substantially better than
for the original data.
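
The dependence on lake size follows from simple geometry: the number
of boundary points grows with the perimeter while the number of water
points grows with the area. The sketch below makes this concrete
under strong simplifying assumptions that are not part of the study:
a circular lake and a boundary ring one footprint (80 meters) wide.
The function name is hypothetical.

```python
import math

ACRE_M2 = 4046.86  # square meters per acre

def boundary_fraction(area_acres, pixel_width_m=80.0):
    """Approximate share of a circular lake's area lying in its boundary ring."""
    radius = math.sqrt(area_acres * ACRE_M2 / math.pi)
    # Ring area / lake area ~ (2*pi*r*w) / (pi*r^2) = 2*w / r, capped at 1.
    return min(1.0, 2.0 * pixel_width_m / radius)

small = boundary_fraction(15.0)    # smallest lake size considered
large = boundary_fraction(1800.0)  # largest lake size considered
```

Under these assumptions essentially all of a 15 acre lake lies within
one footprint of the shore, whereas for an 1,800 acre lake the
boundary ring is roughly a tenth of the area.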

The analysis of data that was preprocessed with the enhancement
algorithm is shown in Column 5 of Table 4. In this case
only those points classified as "water" in the training set are
included in the area estimate. It is seen that improvement in
the accuracy of the area estimate is present in every case.

A comparison of the accuracy of the estimates of area as a
function of size is given in Figure 1. In this figure data is
shown for the original data (Column 1 of Table 4) and the enhanced
data (Column 5 of Table 4). The ordinate in the figure is the
percent of the estimate that must be added to it to give the
correct value. The most significant features evident in this
figure are the smooth behavior of the estimates obtained from
the enhanced data and the erratic behavior for small lakes of
the estimates based on the original data. There is clearly a
significant improvement in the estimation procedure that results
from using the enhanced data. If the results for the interpolated
data (Column 4 of Table 4) were plotted in Figure 1, they
would fall between the curves for the original and enhanced data.
However, the points would not fall on a smooth curve but would
be somewhat oscillatory.
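
The ordinate defined above is referenced to the estimate rather than
to the true value, which is easy to overlook. A minimal sketch of the
quantity, with a hypothetical function name and hypothetical acreages:

```python
def percent_correction(estimate_acres, true_acres):
    """Percent of the estimate that must be added to reach the true value."""
    return 100.0 * (true_acres - estimate_acres) / estimate_acres

# An estimate of 90 acres for a lake whose true area is 100 acres.
correction = percent_correction(90.0, 100.0)
```

The result here is about 11.1%, slightly larger than the 10% error
obtained when the same shortfall is expressed relative to the true
area.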

IV. Discussion and Conclusions 


As discussed by Bartolucci, there are two basic approaches
to water acreage estimation. The first approach is to classify
all the water against all other classes present. The number of
points in the class water found by this procedure is then
multiplied by an appropriate scale factor to obtain the final acreage
estimate. This is the method used in Columns 1, 2, 3 and 5 of
Table 4. For this procedure interpolation provides no improvement,
while enhancement provides a significant improvement.
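
The first approach amounts to a count and a multiplication. The
sketch below is illustrative only. The figure of about 1.1 acres per
original pixel is inferred from the 18,000 acre, 128 x 128 pixel area
quoted in Section II, and it is assumed that after an n x n
enlargement each point represents 1/n^2 of an original pixel. The
function name and the small class map are hypothetical.

```python
import numpy as np

def water_acreage(class_map, water_symbol="W", acres_per_pixel=1.1,
                  enlargement=1):
    """Acreage estimate from the number of points classified as water."""
    n_water = int(np.count_nonzero(np.asarray(class_map) == water_symbol))
    # Each point of an n x n enlargement covers 1/n^2 of an original pixel.
    return n_water * acres_per_pixel / enlargement ** 2

# Hypothetical 4 x 4 classification map: W = water, F = land.
class_map = np.array([list("FFFF"), list("FWWF"), list("FWWF"), list("FFFF")])
acres = water_acreage(class_map, enlargement=4)
```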

The second approach is to estimate a boundary and subwater
classes near the boundary. Which particular points fall in the
subwater class is determined from the spectral characteristics
of the clustered data. The subwater class points inside the
boundary are then added to the water class points to give the
total used in making the estimate. A typical set of cluster
results for interpolated data is shown in Table 5 and Figure 2, which
correspond to data for Rock Lake. It is seen that between the
class water (symbol W) and land (symbol F) there are two distinct
intermediate classes. These are designated the boundary (symbol B)
and the subwater (symbol O). If the basis of employing
interpolation is that it reveals more details near the boundary, then
these classes correspond to that information and should be used
to improve the estimation. It is this procedure that was used to
produce the data of Column 4 in Table 4. Clearly the error of
the estimate was reduced below that of the original data.
However, it is believed that further improvements can be made by more
careful determination of the proper subwater class characteristics
and the number of such classes to utilize in the processing
operation.
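
The counting step of the second approach can be sketched as follows.
The map below is a small idealized example, not the Rock Lake data,
and the sketch assumes that every subwater point already lies inside
the boundary ring, so no spatial test is made. The function name is
hypothetical.

```python
import numpy as np

def count_water_points(class_map, symbols=("W", "O")):
    """Count points whose cluster symbol is in the given set."""
    grid = np.asarray(class_map)
    return int(np.isin(grid, symbols).sum())

# Idealized map: W = water, O = subwater, B = boundary, F = land.
rows = ["FFBBBBFF",
        "FBOOOOBF",
        "FBOWWOBF",
        "FBOOOOBF",
        "FFBBBBFF"]
lake_map = np.array([list(r) for r in rows])

water_only = count_water_points(lake_map, symbols=("W",))
water_plus_subwater = count_water_points(lake_map)
```

For a small lake such as this one the subwater points outnumber the
water points, which is why their treatment has so large an effect on
the estimate.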

Whether the improved techniques using interpolated data will 
exceed the performance with enhanced data and whether use of sub- 
water classes with the enhanced data gives further improvement 


Figure 2. Cluster Results for Interpolated LANDSAT Data,
Rock Lake, 10 Clusters, 4 Channels.

[Line-printer cluster map, scan lines 362-404, not reproduced here.]
CLUSTER MEANS

CLUSTER   POINTS   CH(1)   CH(2)   CH(3)   CH(4)
 1 (?)      113    28.52   20.18   59.42   38.05
 2 (?)      155    37.33   40.46   43.99   21.95
 3 (?)      284    32.15   28.93   43.98   24.15
 4 (L)      271    ?8.49   22.04   50.04   30.65
 5 (V)      325    26.96   20.47   43.72   25.28
 6 (?)      250    29.09   24.07   38.96   21.85
 7 (?)      180    26.35   19.77   33.80   16.00
 8 (?)      169    25.84   19.40   27.01   13.05
 9 (?)      180    25.28   18.51   20.41    7.97
10          438    24.45   18.01   15.04    4.26

CLUSTER VARIANCES

CLUSTER   CH(1)   CH(2)   CH(3)   CH(4)
 1         2.02    4.18    6.85    5.60
 2         2.63   11.32    7.09    3.94
 3         2.27    7.08    5.96    3.92
 4         2.61    5.45    7.68    7.42
 5         3.06    4.06    4.56    2.73
 6         2.04    4.65    3.81    2.02
 7         2.63    6.41    5.04    3.36
 8         2.46    6.02    4.02    2.71
 9         1.56    3.65    3.26    2.24
10         3.04    3.88    3.04    0.93

(?) marks a map symbol, and ? a digit, that is illegible in the source.

TABLE 5. Mean Vector and Covariance Matrix of the 10 Classes
of Rock Lake.
are yet to be determined.

The crop classification experiment did not include boundary 
pixels in the tests and did not use resolution enhanced data. 
Both these elements should be included in future studies to ex- 
plore the full value of the preprocessing techniques. 